The OKPU System in NTCIR11 MedNLP2: An IR Approach to ICD-10 Code Identification
Authors
Abstract
This paper describes an IR (Information Retrieval) approach to identifying the ICD-10 code of a medical term (such as a disease name or a description of a symptom or a complaint) in a medical text. In this approach, we prepare a dictionary of disease names, each paired with its corresponding ICD-10 code(s). The system searches for the disease name most relevant to the input and returns the ICD-10 code paired with that disease name in the dictionary. In IR terms, a disease name in the dictionary can be regarded as a document and an input medical term as a query. To handle an input that does not exactly match any disease name in the dictionary, we introduce two kinds of partial matching and a context search, in which the query includes context words of the input term. Preliminary evaluation on the MedNLP2 test set shows that, with this simple approach, our system correctly identified 54% of the input medical terms.
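The retrieval idea in the abstract can be illustrated with a small sketch. It is not the OKPU system: the abstract does not say which two kinds of partial matching were used, so the character-bigram Dice score below is a stand-in assumption, and the dictionary entries, the `lookup` function, and the way context words are appended to the query are all hypothetical choices made for illustration.

```python
from collections import Counter

# Toy dictionary of disease names paired with ICD-10 codes.
# Illustrative entries only; not the dictionary used in the paper.
DICTIONARY = {
    "essential hypertension": "I10",
    "asthma": "J45",
    "type 2 diabetes mellitus": "E11",
}


def bigrams(text):
    """Return a multiset of character bigrams of the lowercased text."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a, b):
    """Dice coefficient over character bigrams (assumed partial-match score)."""
    ba, bb = bigrams(a), bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    overlap = sum((ba & bb).values())  # multiset intersection
    return 2.0 * overlap / total


def lookup(term, context=()):
    """Treat each dictionary name as a document and the term as a query.

    Exact matches are returned first. Otherwise the query is extended with
    the optional context words and the best-scoring name is returned.
    """
    key = term.lower().strip()
    if key in DICTIONARY:
        return DICTIONARY[key], 1.0
    query = " ".join([term, *context])
    best = max(DICTIONARY, key=lambda name: dice(query, name))
    score = dice(query, best)
    if score == 0:
        return None, 0.0
    return DICTIONARY[best], score


if __name__ == "__main__":
    print(lookup("asthma"))
    print(lookup("hypertension"))
    print(lookup("diabetes", context=("type", "2")))
```

In this sketch an input with no bigram overlap with any dictionary name yields `None` rather than a guessed code; a real system would likely also apply a minimum score threshold.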
Similar resources
Incorporating Unsupervised Features into CRF based Named Entity Recognition
We participated in the extraction of complaint and diagnosis task and the normalization of complaint and diagnosis task of MedNLP2 in NTCIR11. In the extraction task, we use a CRF-based named entity recognition method. Moreover, we incorporate unsupervised features learned from a raw corpus into the CRF. We show that such unsupervised features improve system performance.
Preliminary Report of III&CYUT for NTCIR-11 MedNLP-2
We construct a supervised learning system to participate in the MedNLP2 task of NTCIR-11, which requires finding each keyword at its correct position and normalizing it to a unique ID in ICD10 [4]. In our system, we use part-of-speech (POS) tagging [1] as a feature to train machine learning models based on Conditional Random Fields (CRF) [3] for named entity extraction, then construct a hierarchical class...
Risky Pollution Index: An Integrated Approach Towards Determination of Metallic Pollution Risk in Sediments
In contrast with Mobility Factor (MF) and Risk Assessment Code (RAC) indices, IR attributes a risk share to metal species bound to reducible and oxidizable phases which are totally neglected in both of the two above-mentioned indices. In other words, besides the absolutely mobile fractions, the potentially mobile ones are also regarded in risk evaluation process elaborated by IR. The different ...
Robust Distributed Source Coding with Arbitrary Number of Encoders and Practical Code Design Technique
The robustness property can be added to DSC system at the expense of reducing performance, i.e., increasing the sum-rate. The aim of designing robust DSC schemes is to trade off between system robustness and compression efficiency. In this paper, after deriving an inner bound on the rate–distortion region for the quadratic Gaussian MDC based RDSC system with two encoders, the structure of...
MATLAB CODE FOR AN ENHANCED VIBRATING PARTICLES SYSTEM ALGORITHM
Vibrating particles system (VPS) is a new meta-heuristic algorithm based on the free vibration of single-degree-of-freedom systems with viscous damping. In this algorithm, each agent gradually approaches its equilibrium position; new agents are generated according to current agents and a historically best position. Enhanced vibrating particles system (EVPS) employs a new alternative procedu...